Skip to content

Add plot --cpu modes, including ps-pcpu-timepoint for derived instantensous CPU #424

Open
asmacdo wants to merge 16 commits into
con:mainfrom
asmacdo:plot-pcpu-fix
Open

Add plot --cpu modes, including ps-pcpu-timepoint for derived instantensous CPU #424
asmacdo wants to merge 16 commits into
con:mainfrom
asmacdo:plot-pcpu-fix

Conversation

@asmacdo
Copy link
Copy Markdown
Member

@asmacdo asmacdo commented Apr 29, 2026

Summary

What this PR does is rework the plot so the data is presented honestly.
The old chart drew totals.pcpu and totals.rss as single lines with no per-pid context, so a single pid's bogus 5363% reading looked like an aggregate.
The new chart shows per-pid traces plus max/total envelopes, and introduces a --cpu mode flag:

  • --cpu ps-pcpu (default): plot ps's raw lifetime ratio untransformed. "Lossless" view; every point on the chart is an unaltered ps reading.
  • --cpu ps-cpu-timepoint: at plot time, derive a per-interval estimate from consecutive (pcpu, etime) pairs to approximate instantaneous CPU. Useful when lifetime averages inflate "current" usage.

Both modes draw:

  • A solid max-across-pids lower bound at each timestamp.
  • A dashed upper bound: total_* from the record (RSS, and CPU in ps-pcpu) or sum-across-pids of the derived values (CPU in ps-cpu-timepoint).

Other changes:

  • Drop pmem from the default chart; replace with absolute rss on a secondary axis (twinx).
    Host RAM is shown in the legend when info.json is alongside the usage file.
    Motivated by SLURM, where pmem = rss / host_total makes job usage look tiny on big nodes.
  • Harmonize byte humanization on base 1000 / "kB/MB/GB" suffixes between the formatter and plot paths.
  • Warn at runtime when --sample-interval < 1.0: ps reports etime as integer seconds, so pcpu calculations for sub-second-young pids are unstable.
  • Add docs/resource-statistics.md covering pcpu/rss semantics, how aggregation works, and how con-duct plot renders these. Linked from README.

Issue 399

Related:

The 5363% peak that triggered #399 came from a single pid in the ps data, where ps's cputime / etime calculation went racy on a sub-second-young pid.
That's a sampler-side problem; fixing it requires a different sampling strategy and is out of scope here.
Issue #399 should stay open against that work.

Demo

The branch was rebased to drop temporary demo commits before review.
The pre-rebase tip (bundled images, side-by-side tmp-plot-demo.md) is preserved on the plot-pcpu-fix-demos-2026-05-18 branch for reference.

demo file on demo branch

Caveats

  • Per-pid pcpu in records is max-across-samples within a report interval, not point-in-time.
    pcpu × etime is therefore an upper bound on cputime, not cputime itself, so the ps-cpu-timepoint delta math is approximate.
    Documented.
  • Linux-only math.
    --cpu ps-cpu-timepoint inverts the procps identity pcpu = cputime/etime, which assumes ps reports a cumulative lifetime ratio.
    macOS ps reports a 5/8-decayed EWMA, so the math is meaningless there.
    info.json doesn't record the host platform today; on macOS logs, ps-cpu-timepoint silently produces wrong numbers.
    Default ps-pcpu mode is unaffected.

Release

This PR includes a few changes that should be recorded in the changelog, but for convenience it is simpler not to break them into separate PRs.
Following the release, the release notes should be manually updated:

  • Change: con-duct plot now shows per-pid traces with max-across-pids and totals.* envelopes. Previously it drew only totals.pcpu and totals.rss as a single line per metric, with no per-pid breakdown.
  • Change: con-duct plot now shows absolute rss on a secondary y-axis. Motivated by SLURM, where pmem = rss / host_total is meaningless on shared nodes.
  • Add: host available RAM in the legend when info.json is available
  • Add: --cpu argument to con-duct plot:
    • --cpu ps-pcpu (default): plot ps's raw lifetime pcpu per pid, untransformed. Every point on the chart is an unaltered ps reading.
    • --cpu ps-cpu-timepoint: at plot time, derive a per-interval estimate from consecutive (pcpu, etime) pairs to approximate instantaneous CPU. Sidesteps lifetime-average inflation on short-lived bursty pids.
  • Change: con-duct run warns when --sample-interval is below 1.0s. ps reports etime as integer seconds, so pcpu calculations for sub-second-young pids are unstable.
  • Change: byte humanization in con-duct plot is now decimal (base 1000, kB/MB/GB), matching the run summary instead of the prior mismatched base-1024-with-decimal-suffix labels.
  • Docs: new docs/resource-statistics.md explains what duct's pcpu / rss / pmem actually measure, how aggregation works, and how con-duct plot renders these. Linked from README.

TODOs:

Fixes the 5363% peak in con#399.

- Previous plot summed totals.pcpu/totals.rss across pids per
  interval. Both sums double-count: multi-core for pcpu, shared
  pages for rss.
- Replace with per-pid lines. For pcpu, compute pdcpu
  (delta-corrected %CPU) at plot time from consecutive (etime,
  pcpu) pairs in usage.jsonl.
- Detect kernel pid reuse via Δetime ≈ Δwall (2s tolerance).
  Strict "etime regressed" misses cases where the new pid's etime
  crept above the old's (con#399 pid 3323259: 49s → 54s in 60s of
  wall = pid reuse).
- Clamp pdcpu < 0 to None. duct aggregates pcpu as
  max-across-samples but etime as the last sample, so a
  spike-then-idle pattern in the prior interval can push the
  cputime delta negative even on a continuous pid.
- Filter pids by "notable on either axis": peak pdcpu >= 0.5% OR
  peak rss >= 10MB. Cap legend at hybrid top-10 (top 5 by peak
  pdcpu unioned with top 5 by peak rss).
- vsz commented out by default.

Caveats:
- pcpu × etime is an upper bound on cputime under max-across-
  samples aggregation; pdcpu inherits the approximation.
- ~87% of pids in con#399's tox horde appear in only one record and
  don't get a pdcpu measurement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 85.52632% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.08%. Comparing base (f074bb4) to head (f284fb9).

Files with missing lines Patch % Lines
src/con_duct/plot.py 81.35% 10 Missing and 12 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #424      +/-   ##
==========================================
- Coverage   91.92%   91.08%   -0.85%     
==========================================
  Files          15       15              
  Lines        1127     1245     +118     
  Branches      140      170      +30     
==========================================
+ Hits         1036     1134      +98     
- Misses         69       77       +8     
- Partials       22       34      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@asmacdo
Copy link
Copy Markdown
Member Author

asmacdo commented Apr 29, 2026

Updated requirements from @yarikoptic review:

  • No per-pid color cycling, no cmd labels.
  • Per metric (pcpu orange, pmem blue): faint dotted per-pid traces + bold solid "envelope" at max-of-pids per timestamp ("minimum of total use") + dashed "envelope" at sum-of-pids ("maximum of total use" — may be unbounded for now).
  • Drop rss from default chart, drop top-N cap. Followup PRs can re-add rss/vsz as opt-in flags.
  • 3-entry linestyle legend: solid (max envelope) / dashed (sum envelope) / dotted (per-pid).

asmacdo added a commit to asmacdo/duct that referenced this pull request Apr 29, 2026
Per Yarik's review on PR con#424, replace per-pid colored lines + top-N
legend with per-metric color + max/sum envelope layout.

- All pcpu pid lines share one color (orange); all pmem lines share
  another (blue). Reviewers don't need to identify which pid was
  busy, just that something was.
- For each metric, plot two envelopes across the kept pids:
  - max-across-pids: lower bound on total resource use, solid. "If
    some pid was at 50%, the total was at least 50%."
  - sum-across-pids: upper bound, dashed. Can blow past 100% on
    multi-core (per-pid pdcpu doesn't know about cores) and overstate
    memory (shared pages get counted multiple times in pmem); both
    caveats are accepted.
- Drop rss from the default chart. peak_rss is still used as a
  relevance signal so memory-only pids contribute to the pmem cloud
  and envelope.
- Drop the top-N hybrid cap. With faint same-color dotted per-pid
  lines, a cloud of dozens-to-hundreds of traces reads as background
  texture through which the envelopes are clearly visible. The
  peak_pdcpu >= 0.5% OR peak_rss >= 10MB filter still suppresses
  noise.
- Two legends: metric color (top-left, orange/blue) and color-
  agnostic linestyle key (top-right): upper bound / lower bound /
  per-pid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@asmacdo
Copy link
Copy Markdown
Member Author

asmacdo commented Apr 29, 2026

I think we should consider adding a separate scale 0-100 for pmem on the right side, since pcpu is also percent but can go many times higher for multicore, which effectively squashes pmem. The issue with giving the second axis to pmem is that it would replace rss as the second y-axis.

We could

  • add an option to switch between pmem/rss
  • add a second plot for rss

wdyt?

@yarikoptic yarikoptic added the semver-patch Increment the patch version when merged label Apr 30, 2026
@yarikoptic
Copy link
Copy Markdown
Member

previews look much better! what about an alternative I believe I alluded to in the chat -- by default plot rss (not %mem) and state total physical ram in legend (if we know)? Then it would the right "Memory" with abs sizes up and we would be all set -- and I feel that for memory absolute value is good.

asmacdo added a commit to asmacdo/duct that referenced this pull request Apr 30, 2026
Per Yarik's second-pass review on PR con#424, replace pmem on the shared
y-axis with rss (absolute bytes) on a secondary y-axis.

Reasoning: under SLURM, pmem = rss / host_total, where host_total is
the whole node's physical memory rather than the cgroup's allocated
memory. A job using 100% of a 4GB request on a 256GB host therefore
shows ~1.5% pmem, which the shared y-axis squashes to invisibility.
Absolute rss avoids this entirely.

- pcpu (orange): primary y-axis (%), unchanged.
- rss (blue): secondary y-axis (bytes), formatted via the existing
  HumanizedAxisFormatter + _MEMORY_UNITS.
- Legend label is best-effort: if info.json is available -- passed
  directly, or as a sibling to the usage file via {prefix}info.json
  -- read system.memory_total and render "rss (host: X.XTB)".
  Otherwise fall back to plain "rss". Plot CLI still accepts a bare
  usage file; info.json remains optional.
- Filter unchanged: peak_pdcpu >= 0.5% OR peak_rss >= 10MB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@asmacdo
Copy link
Copy Markdown
Member Author

asmacdo commented Apr 30, 2026

Swap rss for pmem, done.
Adding host memory to the legend, done. (Note: if the info.json is available, we use it, otherwise we dont include. The demo includes one of each.)

Note: I fixed a bug caused by duplication. The Summary formatter used decimal humanization, the graph used binary but still used MB GB instead of MiB GiB. (Fixed by removing duplication, we now report decimal everywhere). I dont think this bugfix belongs in the PR, but is worth including during draft mode so we can review the plot changes together. I've noted removing this as a TODO in the description.

@yarikoptic
Copy link
Copy Markdown
Member

yarikoptic commented May 5, 2026

I slept on it: I think we should make it an explicit option which would allow to switch between plotting "original" pcpu which is a cumulative (and differently across OS as on OSX), vs our "time-point-estimate". So something like --cpu=ps-pcpu,ps-cpu-timepoint so if ps-cpu-timepoint -- we do our conversion estimate and otherwise just plotting original one (no transformation). When plotting, adjust label to become pcpu (time-point) to accent that it is our estimate. The rest stays exactly the same. WDYT?

edit: and in the gallery we will do both plots side -by- side to visualize the divergences.

@asmacdo
Copy link
Copy Markdown
Member Author

asmacdo commented May 5, 2026

@yarikoptic I agree. That interface seems reasonable. Im trying to imagine how this would play in the future when we may have multiple samplers, (ps regular, ps pdcpu calculated, /proc, psutil, whatever). The info.json could capture the sample type(s), so I think this would play nice with that, and default to the 1 on that was collected or force selection if more than sampling method one was selected.

asmacdo added a commit to asmacdo/duct that referenced this pull request May 6, 2026
Splits the per-pid cpu series into two paths in _build_pid_series:

- ps-pcpu (default): raw lifetime ratio from ps -o pcpu, no
  transformation. All records contribute; no entry is dropped.
- ps-cpu-timepoint: existing delta-corrected pdcpu pipeline,
  unchanged.

Y-axis label and color-legend swatch reflect the chosen mode.

Per Yarik's review on PR con#424: lossless raw cpu by default,
explicit opt-in to our derived time-point estimate. Future
samplers (psutil, /proc, ...) can extend the choices list.
asmacdo and others added 11 commits May 18, 2026 11:14
Per Yarik's review on PR con#424, replace per-pid colored lines + top-N
legend with per-metric color + max/sum envelope layout.

- All pcpu pid lines share one color (orange); all pmem lines share
  another (blue). Reviewers don't need to identify which pid was
  busy, just that something was.
- For each metric, plot two envelopes across the kept pids:
  - max-across-pids: lower bound on total resource use, solid. "If
    some pid was at 50%, the total was at least 50%."
  - sum-across-pids: upper bound, dashed. Can blow past 100% on
    multi-core (per-pid pdcpu doesn't know about cores) and overstate
    memory (shared pages get counted multiple times in pmem); both
    caveats are accepted.
- Drop rss from the default chart. peak_rss is still used as a
  relevance signal so memory-only pids contribute to the pmem cloud
  and envelope.
- Drop the top-N hybrid cap. With faint same-color dotted per-pid
  lines, a cloud of dozens-to-hundreds of traces reads as background
  texture through which the envelopes are clearly visible. The
  peak_pdcpu >= 0.5% OR peak_rss >= 10MB filter still suppresses
  noise.
- Two legends: metric color (top-left, orange/blue) and color-
  agnostic linestyle key (top-right): upper bound / lower bound /
  per-pid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Yarik's second-pass review on PR con#424, replace pmem on the shared
y-axis with rss (absolute bytes) on a secondary y-axis.

Reasoning: under SLURM, pmem = rss / host_total, where host_total is
the whole node's physical memory rather than the cgroup's allocated
memory. A job using 100% of a 4GB request on a 256GB host therefore
shows ~1.5% pmem, which the shared y-axis squashes to invisibility.
Absolute rss avoids this entirely.

- pcpu (orange): primary y-axis (%), unchanged.
- rss (blue): secondary y-axis (bytes), formatted via the existing
  HumanizedAxisFormatter + _MEMORY_UNITS.
- Legend label is best-effort: if info.json is available -- passed
  directly, or as a sibling to the usage file via {prefix}info.json
  -- read system.memory_total and render "rss (host: X.XTB)".
  Otherwise fall back to plain "rss". Plot CLI still accepts a bare
  usage file; info.json remains optional.
- Filter unchanged: peak_pdcpu >= 0.5% OR peak_rss >= 10MB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plot's _MEMORY_UNITS used 1024-base divisors with "KB/MB/GB/TB"
suffixes -- correct as KiB/MiB/GiB/TiB per IEC 80000-13, but the
suffixes claimed decimal. SummaryFormatter.naturalsize used a
parallel FILESIZE_SUFFIXES tuple at base 1000 with "kB/MB/GB"
prefixes. The data table was duplicated, the conventions disagreed,
and the plot axis lied about its base.

- Add module-level FILESIZE_UNITS in _formatter.py: single source of
  truth for byte-suffix data. Base 1000, suffixes B/kB/MB/GB/TB/PB/
  EB/ZB/YB.
- Refactor naturalsize to walk FILESIZE_UNITS instead of the local
  FILESIZE_SUFFIXES tuple. Drop FILESIZE_SUFFIXES. Behavior preserved
  (covered by test_summary_formatter_S_sizes).
- Fix the broken "%.3f" docstring example: actual output is
  "3.000 kB", not the "2.930 kB" left over from a 1024-base era.
- plot.py imports FILESIZE_UNITS directly (no alias), drops local
  _MEMORY_UNITS.
- Drop _format_bytes_compact (added in the previous commit) and
  route the rss legend label through SummaryFormatter().naturalsize
  for the same reason: avoid keeping a third byte-format helper.
- test_plot._MEMORY_UNITS parametrize cases switch to 1000-multiples
  with kB/MB/GB suffixes; drops the test_format_bytes_compact case.

Plot axis tick numbers shift slightly (e.g. "1.5MB" was 1.5 * 1024**2
bytes; same physical byte count now displays as "1.6 MB" since the
divisor is smaller). The displayed *meaning* is now correct.

Note: this commit is separable -- it can be cherry-removed and
shipped as its own PR for a distinct changelog entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously dropped pids whose peak pdcpu was <0.5% AND peak rss <10MB,
which rendered an empty plot for tiny/idle workloads (e.g. the gallery
sleep-loop sample). The per-pid lines are now dotted/faint and the
envelopes carry the signal, so the noise floor doesn't need trimming.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The rss "upper bound" envelope was summing each pid's per-interval peak
rss. When a workload spawns or thrashes processes within a report
interval, the per-pid peaks fall in different samples and never coexist,
so summing them invents memory pressure that wasn't there. On bursty
workloads (e.g. mriqc) this padded the line by ~2-3 GB on top of the
true measured concurrent peak.

duct already records the measured peak concurrent rss per report at
sample-time as totals.rss (max-of-sum-per-sample, see _models.py). Read
that directly for the upper-bound line. Within "observed samples only"
framing it's a true upper bound on observed concurrent rss with no
phantom coexistence.

pcpu's upper bound (sum-of-pdcpu) is unchanged. totals.pcpu is unusable
because it carries forward the lifetime-ratio masking that motivated
con#399, and there's no unified etime to delta-invert it against.
The pcpu/rss upper-bound semantics now genuinely diverge, so the shared
envelope loop is split.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure rename, no behavior change. Prepares for an upcoming --cpu
mode flag where the same series can hold either raw ps pcpu or the
delta-corrected pdcpu.
Splits the per-pid cpu series into two paths in _build_pid_series:

- ps-pcpu (default): raw lifetime ratio from ps -o pcpu, no
  transformation. All records contribute; no entry is dropped.
- ps-cpu-timepoint: existing delta-corrected pdcpu pipeline,
  unchanged.

Y-axis label and color-legend swatch reflect the chosen mode.

Per Yarik's review on PR con#424: lossless raw cpu by default,
explicit opt-in to our derived time-point estimate. Future
samplers (psutil, /proc, ...) can extend the choices list.
In ps-pcpu mode the per-pid value is ps's cumulative lifetime ratio,
which procps is documented to compute inaccurately for short-lived
multi-threaded processes (mutually inconsistent reads of utime,
stime, starttime, and uptime; no atomic snapshot). Single-pid
readings can exceed physical maxima (e.g. pcpu=5347 on a 20-core
host in con#399). Summing those across pids -- already inflated by
phantom coexistence on top -- produces an upper-bound line that's
neither a faithful reading of ps nor a meaningful bound. The
per-pid trace cloud and max-across-pids lower bound carry the
visual signal; rss's totals.rss-based dashed upper bound is
unaffected.

In ps-cpu-timepoint mode the sum envelope is unchanged (per-pid
pdcpu is delta-corrected and the negative-pdcpu clamp filters out
the worst aggregation-timing artifacts).
Restores a dashed cpu upper-bound line in ps-pcpu mode, mirroring
the rss path: per-record totals.pcpu is duct's max-of-(sum-per-
sample) within each report interval, i.e. the peak concurrent
total pcpu observed at any single sub-sample. That's a tight
bound under "observed samples only" framing and avoids the
phantom-coexistence overcount of summing per-pid maxes.

ps-cpu-timepoint mode keeps its sum-of-pdcpu envelope (no
per-record totals.pdcpu exists -- pdcpu is computed at plot time).

_totals_rss_series generalised to _totals_series(data, field) so
both metrics share the helper.
ps reports etime as integer seconds, so pcpu calculations for
sub-second-young pids are unstable. Sample intervals below 1.0s
therefore produce erratic numbers. Emit a runtime warning and
document the constraint in the --sample-interval helptext. See
docs/resource-statistics.md for details.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New docs/resource-statistics.md covers what duct's pcpu, rss, and
pmem actually measure (lifetime-average pcpu, shared-page rss),
how aggregation works, and how con-duct plot renders these into
per-pid traces with max/total envelopes. Adds a Documentation
section to README pointing at it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@asmacdo asmacdo changed the title Exploratory plot fix Add plot --cpu modes, including ps-pcpu-timepoint for derived instantensous CPU May 19, 2026
@asmacdo asmacdo marked this pull request as ready for review May 19, 2026 15:50
Copilot AI review requested due to automatic review settings May 19, 2026 15:50
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reworks con-duct plot to present CPU/RSS data with per-PID context (plus envelope bounds), adds a --cpu mode selector for raw vs derived CPU series, and updates byte humanization and documentation to better explain resource-stat semantics.

Changes:

  • Replaces aggregate-only plot lines with per-pid traces plus max/upper-bound envelopes; adds --cpu {ps-pcpu, ps-cpu-timepoint}.
  • Harmonizes byte humanization to decimal (base-1000) across summary output and plotting.
  • Adds a runtime warning for --sample-interval < 1.0 and new documentation explaining CPU/RSS semantics and aggregation caveats.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/con_duct/plot.py Implements per-pid trace plotting, envelope computation, host RAM legend augmentation, and --cpu mode behavior.
src/con_duct/cli.py Wires the new --cpu option into the plot subcommand.
src/con_duct/_utils.py Adds helpers for parsing etime, detecting PID reuse, and deriving interval CPU from (pcpu, etime) pairs.
src/con_duct/_formatter.py Introduces FILESIZE_UNITS and updates naturalsize to use consistent decimal units.
src/con_duct/_duct_main.py Adds a warning when --sample-interval is below 1.0s.
test/test_plot.py Updates plot tests for the new cpu argument and adds unit/series tests plus host-memory lookup tests.
test/duct_main/test_duct_utils.py Adds tests for etime_to_etimes, is_same_pid, and pdcpu_from_pcpu.
README.md Links to the new resource-statistics documentation.
docs/resource-statistics.md Adds detailed explanation of sampling/aggregation semantics and plot rendering behavior.
Comments suppressed due to low confidence (4)

docs/resource-statistics.md:32

  • These examples reference stats[A].rss and total_rss, but the actual usage records use processes for per-pid stats and totals.rss for the aggregated total. Aligning the example paths with the emitted JSON shape will make the doc actionable when readers inspect usage.jsonl.
2. **Per-pid and session-total peaks may come from different sample moments.**
   Per-pid max-reduction and total max-reduction are independent.
   The same record can have `stats[A].rss = X` (A's peak from one sub-sample) and `total_rss = Y` (the peak simultaneous total from another sub-sample).

docs/resource-statistics.md:127

  • This says ps -o rss is reported "in kilobytes", but duct stores RSS in bytes (it multiplies the ps value by 1024 in _sampling.py, and ProcessStats.rss is documented as bytes). Consider clarifying: ps outputs KiB, duct converts to bytes in usage.jsonl/info.json, and the plot axis humanizes bytes.
## Memory — `rss` and `pmem`

`ps -o rss` reports per-process **resident set size**: physical memory currently mapped into the process's address space, in kilobytes.
This counts:

src/con_duct/plot.py:300

  • matplotlib_plot assumes args.cpu exists; when called programmatically (or from tests that construct argparse.Namespace manually) this becomes an AttributeError that is caught by the broad except Exception and reported as a generic processing error. To keep behavior consistent with the CLI default, consider using getattr(args, "cpu", CPU_MODE_PS_PCPU) (and optionally validating it) before calling _build_pid_series.
    try:
        pid_series = _build_pid_series(data, cpu_mode=args.cpu)
        totals_rss_xs, totals_rss_ys = _totals_series(data, "rss")
        totals_pcpu_xs, totals_pcpu_ys = _totals_series(data, "pcpu")
    except KeyError as e:

src/con_duct/plot.py:120

  • In ps-cpu-timepoint mode, a ValueError from etime_to_etimes triggers continue, which drops the entire pid observation for that record (including RSS/PMEM) and also skips updating pid_state. Consider instead keeping the record (append cpu=None but still append rss/pmem) and resetting the pid baseline/state when etime is missing/unparseable, so the next delta isn’t computed from stale state.
                try:
                    etime_sec = etime_to_etimes(p.get("etime", ""))
                except ValueError:
                    continue

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docs/resource-statistics.md
Comment thread src/con_duct/plot.py Outdated
asmacdo and others added 3 commits May 19, 2026 11:04
=== Do not change lines below ===
{
 "chain": [],
 "cmd": "./.update-readme-help.py",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
The usage record schema nests session totals under a `totals` object
(`totals.rss`, `totals.pcpu`, etc. per `Sample.for_json`), not flat
`total_rss` / `total_pcpu`. Update the doc to match so readers don't
look for non-existent keys.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The docstring still described the older shape: a "pdcpu cloud" with
max/sum envelopes. After the --cpu mode flag and the totals.* upper
bound, neither is accurate. Rewrite to describe the per-pid CPU/rss
cloud and the per-mode envelope sources.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@yarikoptic yarikoptic added semver-minor Increment the minor version when merged UX and removed semver-patch Increment the patch version when merged labels May 19, 2026
@asmacdo
Copy link
Copy Markdown
Member Author

asmacdo commented May 19, 2026

@yarikoptic yarikoptic added the release Create a release when this pr is merged label May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release Create a release when this pr is merged semver-minor Increment the minor version when merged UX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants